Joint Channel-and-Data Estimation for
Large-MIMO Systems with Low-Precision ADCs 

Chao-Kai Wen, Shi Jin, Kai-Kit Wong, Chang-Jen Wang, and Gang Wu 


Abstract: The use of low-precision (e.g., 1-3 bits) analog-to-digital
converters (ADCs) in very large multiple-input multiple-output (MIMO)
systems is a technique to reduce cost and power consumption. In this
context, however, it has been shown that a very long training
duration is required just to obtain acceptable channel state
information (CSI) at the receiver. A possible solution for quantized
MIMO systems is joint channel-and-data (JCD) estimation. This paper
first develops an analytical framework for studying the quantized
MIMO system using JCD estimation. In particular, we use Bayes-optimal
inference for the JCD estimation and realize this estimator using a
recent technique based on approximate message passing. Large-system
analysis based on the replica method is then adopted to derive the
asymptotic performance of the JCD estimator. Simulation results
confirm our theoretical findings and reveal that the JCD estimator
can provide a significant gain over conventional pilot-only schemes
in the quantized MIMO system.

I. Introduction 

Very large multiple-input multiple-output (MIMO) or "massive MIMO"
systems [1] are widely considered a key technology for 5G wireless
communication networks [2,3]. Such systems promote the use of a very
large number of antennas at the base station (BS) (e.g., hundreds or
thousands) to serve a number of user terminals (e.g., tens or
hundreds) in the same time-frequency resource. Nonetheless, the high
dimensionality greatly increases hardware cost and power consumption.
This motivates the study of MIMO systems with very low-precision
(e.g., 1-3 bits) analog-to-digital converters (ADCs).

Several aspects of low-precision ADCs have been studied in the
literature for single-input single-output (SISO) channels [4] and,
more recently, MIMO channels (see [5] and references therein). In
this paper, our focus is on signal detection for quantized MIMO
systems where each receive antenna is equipped with a very
low-precision ADC. Prior work in this direction covered code-division
multiple-access (CDMA) systems [6], massive MIMO systems [5-9],
distributed antenna systems (DASs) [10], and compressed sensing [11].
However, most previous work assumed perfect channel state information
at the receiver (CSIR). In particular, [8] revealed that in a MIMO
system with one-bit ADCs, achieving the same performance as the
full-CSI case requires a very long training sequence (more than 50
times the number of users). Clearly, the assumption of perfect CSIR
is questionable, particularly for quantized MIMO systems. The
requirement of a long training sequence motivates us to consider
joint channel-and-data (JCD) estimation, in which estimated payload
data are utilized to aid channel estimation. A major advantage of JCD
estimation is that relatively few pilot symbols are required to
achieve equivalent channel and data estimation performance.

Fig. 1. The quantized MIMO system.

Although performance enhancement from this technique is expected, we
are not aware of any study of quantized MIMO systems using JCD
estimation.¹ In this paper, we take the important first step of
analyzing the achievable performance of the quantized MIMO system
using JCD estimation. To this end, we use Bayes-optimal inference for
JCD estimation, as this approach provides the minimum mean-square
error (MSE) with respect to (w.r.t.) the channels and payload data.
However, the complexity of carrying out the Bayes-optimal JCD
estimator appears prohibitive. To address this issue, we use a
variant of belief propagation (BP) to approximate the marginal
distributions of each data and channel component. In particular, we
modify the bilinear generalized approximate message passing (BiG-AMP)
scheme in [15] and adapt it to the quantized MIMO system. We refer to
the proposed method as GAMP-based JCD. Furthermore, by applying
large-system analysis based on the replica method from statistical
physics, we provide the theoretical performance of the Bayes-optimal
JCD estimator.² Simulations are used to verify the efficiency of the
proposed algorithm and the accuracy of our analysis.

¹ In the context of unquantized MIMO systems, several aspects of JCD
estimation have already been widely studied; see, e.g., [12-14].

² In this paper, the Bayes-optimal JCD estimator is regarded as the
theoretically optimal estimator, while the GAMP-based JCD algorithm
can be thought of as a practical algorithm that approximates it.







II. System Model 


We consider a block-fading uplink channel with $K$ transmit antennas
and $N$ receive antennas, in which the channel remains constant over
$T$ consecutive symbol intervals (i.e., a block). The received signal
$\mathbf{Y} \in \mathbb{C}^{N \times T}$ over the block interval can
be written in matrix form as

$$ \mathbf{Y} = \frac{1}{\sqrt{K}} \mathbf{H}\mathbf{X} + \mathbf{W} = \mathbf{Z} + \mathbf{W}, \tag{1} $$

where $\mathbf{X} \in \mathbb{C}^{K \times T}$ denotes the transmit
symbols in the block, $\mathbf{H} \in \mathbb{C}^{N \times K}$ is the
matrix containing the fading coefficients associated with the channels
between the transmit antennas and the receive antennas,
$\mathbf{W} \in \mathbb{C}^{N \times T}$ represents the additive
temporally and spatially white Gaussian noise with zero mean and
element-wise variance $\sigma_w^2$, and we define
$\mathbf{Z} = \frac{1}{\sqrt{K}} \mathbf{H}\mathbf{X}$.

In the quantized MIMO system, as shown in Figure 1, each received
signal component $Y^{ij}$, $1 \le i \le N$, $1 \le j \le T$, is
quantized separately into a finite set of prescribed values by a
$B$-bit quantizer $\mathsf{Q}_c$. The resulting quantized signal reads

$$ \tilde{\mathbf{Y}} = \mathsf{Q}_c(\mathbf{Z} + \mathbf{W}). \tag{2} $$

Specifically, each complex-valued quantizer $\mathsf{Q}_c(\cdot)$ is
defined as $\tilde{Y}^{ij} = \mathsf{Q}_c(Y^{ij}) =
\mathsf{Q}(\mathrm{Re}\{Y^{ij}\}) + \mathrm{j}\,\mathsf{Q}(\mathrm{Im}\{Y^{ij}\})$,
i.e., the real and imaginary parts are quantized separately. The
real-valued quantizer $\mathsf{Q}$ maps a real-valued input to one of
$2^B$ bins, which are characterized by the set of $2^B - 1$ thresholds
$[r_1, r_2, \ldots, r_{2^B-1}]$, such that
$-\infty < r_1 < r_2 < \cdots < r_{2^B-1} < \infty$. For notational
consistency, we define $r_0 = -\infty$ and $r_{2^B} = \infty$. The
output $\tilde{Y}$ is assigned a value in $(r_{b-1}, r_b]$ when the
quantizer input $Y$ falls in the interval $(r_{b-1}, r_b]$ (namely,
the $b$-th bin). For example, the thresholds of a typical uniform
quantizer with quantization step size $\Delta$ are given by

$$ r_b = \left(-2^{B-1} + b\right)\Delta, \quad \text{for } b = 1, \ldots, 2^B - 1, \tag{3} $$

and the quantization output is assigned the value $r_b - \frac{\Delta}{2}$
when the input falls in the $b$-th bin. Figure 1 shows an example of
the 3-bit uniform quantizer.
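The uniform quantizer of (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the value assigned to the last bin (whose upper threshold is infinite) is taken to be $r_{2^B-1} + \Delta/2$, the natural symmetric extension of the rule $r_b - \Delta/2$.

```python
def uniform_thresholds(num_bits, step):
    """Thresholds r_1 < ... < r_{2^B - 1} of eq. (3): r_b = (-2^(B-1) + b) * step."""
    half = 2 ** (num_bits - 1)
    return [(-half + b) * step for b in range(1, 2 ** num_bits)]


def quantize_real(y, num_bits, step):
    """Map a real input to the representative value of its bin (r_{b-1}, r_b]."""
    thresholds = uniform_thresholds(num_bits, step)
    # Bin index b in 1..2^B: one plus the number of thresholds strictly below y.
    b = 1 + sum(1 for r in thresholds if y > r)
    if b == 2 ** num_bits:
        # Last bin: r_b is +inf, so the value is written relative to r_{b-1}
        # (assumption: symmetric extension of the rule r_b - step/2).
        return thresholds[b - 2] + step / 2
    return thresholds[b - 1] - step / 2


def quantize_complex(y, num_bits, step):
    """Q_c: quantize the real and imaginary parts separately."""
    return complex(quantize_real(y.real, num_bits, step),
                   quantize_real(y.imag, num_bits, step))
```

For a 3-bit quantizer with step 0.5, the thresholds are -1.5, -1.0, ..., 1.5 and the eight output levels are -1.75, -1.25, ..., 1.75.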

Since the channel matrix $\mathbf{H}$ needs to be estimated at the
receiver, we let the first $T_1$ symbols of the block of $T$ symbols
serve as pilot sequences. The remaining $T_2 = T - T_1$ symbols are
used for data transmission. This setting is equivalent to partitioning
$\mathbf{X}$ as $\mathbf{X} = [\mathbf{X}_1 \ \mathbf{X}_2]$ with
$\mathbf{X}_1 \in \mathbb{C}^{K \times T_1}$ and
$\mathbf{X}_2 \in \mathbb{C}^{K \times T_2}$. The training and data
phases are referred to as the t-phase and the d-phase, respectively.
We assume that the matrix $\mathbf{X}_1$ (or $\mathbf{X}_2$) is
composed of independent and identically distributed (i.i.d.) random
variables $X_1^{ij}$ (or $X_2^{ij}$) generated from a known
probability distribution $P_{X_1}$ (or $P_{X_2}$), i.e.,

$$ P_X(\mathbf{X}) = P_{X_1}(\mathbf{X}_1)\, P_{X_2}(\mathbf{X}_2) \tag{4} $$

with $P_{X_1}(\mathbf{X}_1) = \prod_{i,j} P_{X_1}(X_1^{ij})$ and
$P_{X_2}(\mathbf{X}_2) = \prod_{i,j} P_{X_2}(X_2^{ij})$.

Since the pilot and data symbols should appear on constellation
points uniformly, the ensemble averages of $\{X_1^{ij}\}$ and
$\{X_2^{ij}\}$ are assumed to be zero. In addition, we let
$\sigma_{x_1}^2$ and $\sigma_{x_2}^2$ be the transmit powers during
the t-phase and d-phase, respectively, i.e.,
$\mathsf{E}\{|X_1^{ij}|^2\} = \sigma_{x_1}^2$ and
$\mathsf{E}\{|X_2^{ij}|^2\} = \sigma_{x_2}^2$.


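Similarly, we assume that each entry $H^{ij}$ is drawn from the
complex Gaussian distribution $\mathcal{N}_{\mathbb{C}}(0, \sigma_h^2)$,
where $\sigma_h^2$ indicates the large-scale fading factor.³ Let $P_H$
denote the density of $\mathcal{N}_{\mathbb{C}}(0, \sigma_h^2)$. Then
we have

$$ P_H(\mathbf{H}) = \prod_{i,j} P_H(H^{ij}). \tag{5} $$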
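To make the signal model concrete, the following sketch draws one block according to (1) and (2) with the pilot/data partition $\mathbf{X} = [\mathbf{X}_1 \ \mathbf{X}_2]$. It is an illustration only: QPSK symbols in both phases, a 1-bit quantizer with outputs $\pm\Delta/2$, and all helper names (`crandn`, `qpsk`, `one_bit`, `simulate_block`) are assumptions made for this example.

```python
import math
import random


def crandn(rng, var=1.0):
    """Sample a circularly symmetric complex Gaussian with the given variance."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def qpsk(rng):
    """Sample a unit-power QPSK symbol (assumed constellation for pilots and data)."""
    a = math.sqrt(0.5)
    return complex(rng.choice((-a, a)), rng.choice((-a, a)))


def one_bit(y, step):
    """1-bit Q_c: each real dimension maps to -step/2 or +step/2 (threshold at 0)."""
    level = step / 2.0
    return complex(level if y.real > 0 else -level,
                   level if y.imag > 0 else -level)


def simulate_block(N, K, T1, T2, sigma_w2, sigma_h2=1.0, step=1.0, seed=0):
    """Draw H and X = [X1 X2], form Y = H X / sqrt(K) + W, then quantize Y."""
    rng = random.Random(seed)
    T = T1 + T2
    H = [[crandn(rng, sigma_h2) for _ in range(K)] for _ in range(N)]
    X = [[qpsk(rng) for _ in range(T)] for _ in range(K)]
    Y = [[sum(H[n][k] * X[k][j] for k in range(K)) / math.sqrt(K)
          + crandn(rng, sigma_w2)
          for j in range(T)] for n in range(N)]
    Y_q = [[one_bit(y, step) for y in row] for row in Y]
    X1 = [row[:T1] for row in X]   # pilot block, known at the receiver
    X2 = [row[T1:] for row in X]   # data block, to be estimated
    return H, X1, X2, Y_q
```

A JCD estimator would receive only `Y_q` and `X1`; a pilot-only scheme would use the first `T1` columns of `Y_q` for channel estimation and the rest for detection.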

III. JCD Estimation 


Our focus is on the setting where the receiver knows the
distributions of $\mathbf{H}$ and $\mathbf{X}$ but not their
realizations. In the conventional pilot-only scheme, the receiver
first uses $\tilde{\mathbf{Y}}_1$ (the quantized observations of the
t-phase) and $\mathbf{X}_1$ to generate an estimate of $\mathbf{H}$
and then uses the estimated channel to estimate the data
$\mathbf{X}_2$ from $\tilde{\mathbf{Y}}_2$ (the quantized observations
of the d-phase) [8]. In contrast to the pilot-only scheme, we consider
JCD estimation, where the BS estimates both $\mathbf{H}$ and
$\mathbf{X}_2$ from $\tilde{\mathbf{Y}}$ given $\mathbf{X}_1$. We
treat the problem in the framework of Bayesian inference [16].

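To this end, we first define the likelihood, i.e., the distribution
of the received signals under (2) conditional on the unknown
parameters, as

$$ P(\tilde{\mathbf{Y}} | \mathbf{H}, \mathbf{X}) = \prod_{i=1}^{N} \prod_{j=1}^{T} P_{\mathrm{out}}\!\left(\tilde{Y}^{ij} \,\middle|\, Z^{ij}\right), \tag{6} $$

where

$$ P_{\mathrm{out}}(\tilde{Y} | Z) = \left( \frac{1}{\sqrt{\pi\sigma_w^2}} \int_{r_{b-1}}^{r_b} \mathrm{d}y \, e^{-\frac{(y - \mathrm{Re}(Z))^2}{\sigma_w^2}} \right) \left( \frac{1}{\sqrt{\pi\sigma_w^2}} \int_{r_{b'-1}}^{r_{b'}} \mathrm{d}y \, e^{-\frac{(y - \mathrm{Im}(Z))^2}{\sigma_w^2}} \right) \tag{7} $$

when $\mathrm{Re}(\tilde{Y})$ and $\mathrm{Im}(\tilde{Y})$ fall in the
$b$-th and the $b'$-th bins, respectively, i.e.,
$\mathrm{Re}(\tilde{Y}) \in (r_{b-1}, r_b]$ and
$\mathrm{Im}(\tilde{Y}) \in (r_{b'-1}, r_{b'}]$. Let
$\Phi(x) = \int_{-\infty}^{x} \mathrm{D}z$ with
$\mathrm{D}z = \frac{e^{-z^2/2}}{\sqrt{2\pi}} \mathrm{d}z$. Then (7)
has the following closed-form expression

$$ P_{\mathrm{out}}(\tilde{Y} | Z) = \Psi_b\!\left(\mathrm{Re}(Z)\right) \Psi_{b'}\!\left(\mathrm{Im}(Z)\right), \tag{8} $$

where

$$ \Psi_b(x) = \Phi\!\left( \frac{\sqrt{2}\,(r_b - x)}{\sigma_w} \right) - \Phi\!\left( \frac{\sqrt{2}\,(r_{b-1} - x)}{\sigma_w} \right). \tag{9} $$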
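The closed form in (8) and (9) is straightforward to evaluate numerically. The sketch below is a minimal illustration, with hypothetical function names and the bin index `b` running from 1 to $2^B$; it uses the error function from the Python standard library for the Gaussian cumulative distribution function.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def psi(b, x, thresholds, sigma_w2):
    """Psi_b(x) of eq. (9); thresholds = [r_1, ..., r_{2^B - 1}], b = 1..2^B."""
    r = [-math.inf] + list(thresholds) + [math.inf]   # r_0 = -inf, r_{2^B} = +inf
    s = math.sqrt(sigma_w2)
    upper = Phi(math.sqrt(2.0) * (r[b] - x) / s)
    lower = Phi(math.sqrt(2.0) * (r[b - 1] - x) / s)
    return upper - lower


def p_out(b_re, b_im, z, thresholds, sigma_w2):
    """Eq. (8): probability that Re and Im of the output fall in bins b_re, b_im."""
    return (psi(b_re, z.real, thresholds, sigma_w2)
            * psi(b_im, z.imag, thresholds, sigma_w2))
```

Because the bins partition the real line, summing `psi` over all bins (or `p_out` over all bin pairs) gives one for any noiseless input, which is a quick sanity check on an implementation.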


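The prior distributions of $\mathbf{H}$ and $\mathbf{X}$ are given by
(5) and (4), respectively. Then the posterior probability can be
computed according to Bayes' rule:

$$ P(\mathbf{H}, \mathbf{X} | \tilde{\mathbf{Y}}) = \frac{P(\tilde{\mathbf{Y}} | \mathbf{H}, \mathbf{X})\, P_H(\mathbf{H})\, P_X(\mathbf{X})}{P(\tilde{\mathbf{Y}})}, \tag{10} $$

where $P(\tilde{\mathbf{Y}}) = \int_{\mathbf{H}} \int_{\mathbf{X}}
\mathrm{d}\mathbf{H}\,\mathrm{d}\mathbf{X}\,
P(\tilde{\mathbf{Y}} | \mathbf{H}, \mathbf{X}) P_H(\mathbf{H}) P_X(\mathbf{X})$
is the marginal likelihood.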

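Given the posterior probability, an estimate of $X^{ij}$ can be
obtained by the posterior mean

$$ \hat{X}^{ij} = \int \mathrm{d}X^{ij} \, X^{ij} \, \mathscr{P}(X^{ij}), \tag{11} $$

where

$$ \mathscr{P}(X^{ij}) = \int_{\mathbf{H}} \int_{\mathbf{X} \setminus X^{ij}} \mathrm{d}\mathbf{H}\,\mathrm{d}\mathbf{X} \; P(\mathbf{H}, \mathbf{X} | \tilde{\mathbf{Y}}) \tag{12} $$

is the marginal posterior probability of $X^{ij}$. Here, the notation
$\int_{\mathbf{X} \setminus X^{ij}} \mathrm{d}\mathbf{X}$ denotes
integration over all the variables in $\mathbf{X}$ except $X^{ij}$.
If the MSE of an estimate $\hat{\mathbf{X}}_t$ w.r.t. $\mathbf{X}_t$
is defined as

$$ \mathsf{mse}(\mathbf{X}_t | \tilde{\mathbf{Y}}) = \int_{\mathbf{H}} \int_{\mathbf{X}} \mathrm{d}\mathbf{H}\,\mathrm{d}\mathbf{X} \; P(\mathbf{H}, \mathbf{X} | \tilde{\mathbf{Y}}) \, \| \mathbf{X}_t - \hat{\mathbf{X}}_t \|_F^2, \tag{13} $$

for $t = 1, 2$, then the posterior mean estimator (11) gives the
minimum MSE (MMSE) [16]. Notice that given a known pilot matrix, i.e.,
$P_{X_1}(\mathbf{X}_1) = \delta(\mathbf{X}_1 - \bar{\mathbf{X}}_1)$
with $\bar{\mathbf{X}}_1$ denoting the transmitted pilots, we easily
obtain $\hat{\mathbf{X}}_1 = \bar{\mathbf{X}}_1$ from (11). In this
case, we have $\mathsf{mse}(\mathbf{X}_1 | \tilde{\mathbf{Y}}) = 0$.
Similarly, the Bayes estimate of $H^{ij}$ is given by

$$ \hat{H}^{ij} = \int \mathrm{d}H^{ij} \, H^{ij} \, \mathscr{P}(H^{ij}), \tag{14} $$

where $\mathscr{P}(H^{ij}) = \int_{\mathbf{H} \setminus H^{ij}}
\int_{\mathbf{X}} \mathrm{d}\mathbf{H}\,\mathrm{d}\mathbf{X}\,
P(\mathbf{H}, \mathbf{X} | \tilde{\mathbf{Y}})$ denotes the marginal
posterior probability of $H^{ij}$. The estimate $\hat{\mathbf{H}}$
also minimizes the MSE

$$ \mathsf{mse}(\mathbf{H} | \tilde{\mathbf{Y}}) = \int_{\mathbf{H}} \int_{\mathbf{X}} \mathrm{d}\mathbf{H}\,\mathrm{d}\mathbf{X} \; P(\mathbf{H}, \mathbf{X} | \tilde{\mathbf{Y}}) \, \| \mathbf{H} - \hat{\mathbf{H}} \|_F^2. \tag{15} $$

Hereafter, we refer to (11) and (14) as the Bayes-optimal estimator.

³ For ease of notation, we consider the case where all the
transmitters have the same large-scale fading factor, but our results
can be easily extended.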

Although the Bayes-optimal estimator provides the MMSE estimates,
direct computation of (11) and (14) is intractable due to the
high-dimensional integrals involved in the marginal posteriors
$\mathscr{P}(X^{ij})$ and $\mathscr{P}(H^{ij})$. BP [17] provides a
practical alternative for approximating these marginal posteriors. In
the recent compressed sensing literature, the Bayesian framework in
combination with a BP algorithm has given rise to the so-called
approximate message passing (AMP) algorithm [18] and the generalized
AMP (GAMP) [19]. Applied to our MIMO context, this development means
that when $\mathbf{H}$ is perfectly known, GAMP provides a tractable
way to approximate the marginal posteriors $\mathscr{P}(X^{ij})$.
Remarkably, it has been proved that the approximations become exact
in the large-system limit for a dense matrix $\mathbf{H}$ with
sub-Gaussian entries. More recently, Parker et al. [15] applied the
GAMP strategy to the problem of reconstructing matrices from bilinear
noisy observations (i.e., reconstructing $\mathbf{H}$ and $\mathbf{X}$
from $\tilde{\mathbf{Y}}$), which is referred to as bilinear GAMP
(BiG-AMP). The BiG-AMP scheme can be applied to approximate the
Bayes-optimal JCD estimator, and we adapt it to the quantized MIMO
setting. We call the resulting algorithm the GAMP-based JCD
algorithm. Due to space limitations, we omit the details of the
algorithm development in this paper and show its simulation
performance in Section V.


IV. Performance Analysis 

Knowing the theoretical lower bound on the estimation error is useful
for assessing any developed algorithm. Therefore, in this section, our
objective is to derive analytical results for the average MSEs of
$\mathbf{X}_2$ and $\mathbf{H}$ for the Bayes-optimal JCD estimator,
i.e., $\mathsf{mse}_{X_t} = \mathsf{E}\{\mathsf{mse}(\mathbf{X}_t | \tilde{\mathbf{Y}})\}$
and $\mathsf{mse}_H = \mathsf{E}\{\mathsf{mse}(\mathbf{H} | \tilde{\mathbf{Y}})\}$.
Our analysis investigates the high-dimensional regime where
$N, K, T \to \infty$ but the ratios $N/K = \alpha$, $T/K = \beta$,
and $T_t/K = \beta_t$, for $t = 1, 2$, are fixed and finite. For
convenience, we simply write $K \to \infty$ (referred to as the
large-system limit) to denote this high-dimensional limit. Following
the argument of [20,21], it can be shown that $\mathsf{mse}_{X_t}$ and
$\mathsf{mse}_H$ are saddle points of the average free entropy

$$ \Phi \triangleq \frac{1}{K^2} \mathsf{E}_{\tilde{\mathbf{Y}}}\{ \log P(\tilde{\mathbf{Y}}) \}, \tag{16} $$

where $P(\tilde{\mathbf{Y}})$ denotes the marginal likelihood in (10),
namely the partition function. The major difficulty in computing (16)
is the expectation of the logarithm of $P(\tilde{\mathbf{Y}})$, which,
nevertheless, can be facilitated by rewriting $\Phi$ as [22]

$$ \Phi = \frac{1}{K^2} \lim_{\tau \to 0} \frac{\partial}{\partial \tau} \log \mathsf{E}_{\tilde{\mathbf{Y}}}\{ P^{\tau}(\tilde{\mathbf{Y}}) \}. \tag{17} $$

The expectation operator is thereby moved inside the logarithm. We
first evaluate $\mathsf{E}_{\tilde{\mathbf{Y}}}\{P^{\tau}(\tilde{\mathbf{Y}})\}$
for an integer-valued $\tau$, and then generalize the result to any
positive real number $\tau$. This technique is called the replica
method, and has been widely adopted in statistical physics [22] and
in the information theory literature, e.g., [12,23-29]. Under the
assumption of replica symmetry (RS), the following results are
obtained.
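The identity behind (17) follows from a first-order expansion in $\tau$. A short derivation is sketched here, writing $P$ as shorthand for the marginal likelihood inside the expectation of (16):

```latex
\begin{align*}
\mathsf{E}\{P^{\tau}\} &= \mathsf{E}\{e^{\tau \log P}\}
   = 1 + \tau\,\mathsf{E}\{\log P\} + O(\tau^{2}), \\
\log \mathsf{E}\{P^{\tau}\} &= \tau\,\mathsf{E}\{\log P\} + O(\tau^{2}), \\
\lim_{\tau \to 0} \frac{\partial}{\partial \tau} \log \mathsf{E}\{P^{\tau}\}
   &= \mathsf{E}\{\log P\}.
\end{align*}
```

The non-rigorous step of the replica method is not this identity but the continuation of a formula obtained for integer $\tau$ to real $\tau$ near zero.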


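Proposition 1: As $K \to \infty$, the asymptotic MSEs w.r.t.
$\mathbf{X}_t$ and $\mathbf{H}$ are associated with the MSEs of the
scalar Gaussian channels

$$ Y_{\tilde{q}_H} = \sqrt{\tilde{q}_H}\, H + W_H, \tag{18} $$

$$ Y_{\tilde{q}_{X_t}} = \sqrt{\tilde{q}_{X_t}}\, X_t + W_X, \tag{19} $$

where $W_H, W_X \sim \mathcal{N}_{\mathbb{C}}(0, 1)$ are independent
of $H \sim P_H$ and $X_t \sim P_{X_t}$. Here, the parameters
$\tilde{q}_H$ and $\tilde{q}_{X_t}$ are the solutions to the set of
fixed-point equations

$$ \tilde{q}_H = \sum_{t=1}^{2} \beta_t \, q_{X_t} \chi_t, \tag{20a} $$

$$ \tilde{q}_{X_t} = \alpha \, q_H \chi_t, \tag{20b} $$

$$ q_H = c_H - \mathsf{mse}_H, \tag{20c} $$

$$ q_{X_t} = c_{X_t} - \mathsf{mse}_{X_t}, \tag{20d} $$

where $\mathsf{mse}_H = \mathsf{E}\{|H - \mathsf{E}\{H | Y_{\tilde{q}_H}\}|^2\}$
and $\mathsf{mse}_{X_t} = \mathsf{E}\{|X_t - \mathsf{E}\{X_t | Y_{\tilde{q}_{X_t}}\}|^2\}$
are the asymptotic MSEs w.r.t. $\mathbf{H}$ and $\mathbf{X}_t$,
respectively, $c_{X_t} = \mathsf{E}\{|X_t|^2\} = \sigma_{x_t}^2$,
$c_H = \mathsf{E}\{|H|^2\} = \sigma_h^2$, and

$$ \chi_t = \sum_{b=1}^{2^B} \int \mathrm{D}v \, \frac{\left( \Psi_b'\!\left(\sqrt{q_H q_{X_t}}\, v\right) \right)^2}{\Psi_b\!\left(\sqrt{q_H q_{X_t}}\, v\right)} \tag{21} $$

with

$$ \Psi_b(V_t) = \Phi\!\left( \frac{\sqrt{2}\, r_b - V_t}{\sqrt{\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t}}} \right) - \Phi\!\left( \frac{\sqrt{2}\, r_{b-1} - V_t}{\sqrt{\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t}}} \right) \tag{22} $$

and

$$ \Psi_b'(V_t) = \frac{\partial \Psi_b(V_t)}{\partial V_t} = \frac{ e^{-\frac{(\sqrt{2}\, r_{b-1} - V_t)^2}{2(\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t})}} - e^{-\frac{(\sqrt{2}\, r_b - V_t)^2}{2(\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t})}} }{ \sqrt{2\pi \left(\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t}\right)} }. \tag{23} $$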

In the t-phase, i.e., $t = 1$, the pilot matrix $\mathbf{X}_1$ is
known. Thus, we substitute $\mathsf{mse}_{X_1} = 0$ into the above
expressions.

Proof: An outline of the proof is given in the appendix. ■
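The quantity $\chi_t$ in (21) has no closed form in general, but it is easy to evaluate numerically. The sketch below is one possible approach, not the authors' implementation: it approximates the integral over $\mathrm{D}v$ by the trapezoidal rule on a truncated grid, and the function name, grid size and truncation range are assumptions. Its arguments are $m = q_H q_{X_t}$ and $s^2 = \sigma_w^2 + c_H c_{X_t} - q_H q_{X_t}$.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def chi(m, s2, thresholds, num_points=2001, v_max=8.0):
    """chi_t of eq. (21), with m = q_H * q_Xt and s2 = sigma_w^2 + c_H c_Xt - m.

    The Gaussian integral is approximated by the trapezoidal rule on
    [-v_max, v_max] (an assumption of this sketch).
    """
    r = [-math.inf] + list(thresholds) + [math.inf]   # r_0 and r_{2^B}
    s = math.sqrt(s2)
    dv = 2.0 * v_max / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        v = -v_max + i * dv
        weight = math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi) * dv
        if i in (0, num_points - 1):
            weight *= 0.5
        V = math.sqrt(m) * v
        for b in range(1, len(r)):
            hi = (math.sqrt(2.0) * r[b] - V) / s
            lo = (math.sqrt(2.0) * r[b - 1] - V) / s
            prob = Phi(hi) - Phi(lo)                  # Psi_b(V), eq. (22)
            # Derivative of Psi_b with respect to V; exp(-inf) evaluates to 0
            # for the two outermost thresholds.
            dprob = ((math.exp(-0.5 * lo * lo) - math.exp(-0.5 * hi * hi))
                     / math.sqrt(2.0 * math.pi * s2))
            if prob > 1e-300:                         # skip numerically empty bins
                total += weight * dprob * dprob / prob
    return total
```

Two limiting cases give a useful check. For a 1-bit quantizer with $m = 0$, the sum reduces to $2/(\pi s^2)$, and for a very fine quantizer $\chi_t$ approaches $1/s^2$, the value for an unquantized Gaussian observation.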
The above result reveals that, in the large-system limit, the
performance of the quantized MIMO system employing the Bayes-optimal
JCD estimator is fully characterized by the equivalent scalar
Gaussian channels (18) and (19). For example, the achievable rate
under separate decoding (SD) is the mutual information between
$Y_{\tilde{q}_{X_t}}$ and $X_t$ for the scalar Gaussian channel (19).
Note that, in contrast to joint detection and decoding, SD involves
joint multiuser detection followed by a bank of independent decoders.
Also, the corresponding MSE w.r.t. $\mathbf{H}$ can be evaluated
through the scalar Gaussian channel (18). Specifically, if the signal
is drawn from a quadrature phase shift keying (QPSK) constellation,
the corresponding bit error rate (BER) reads

$$ P_e = Q\!\left( \sqrt{\tilde{q}_{X_2}} \right), \tag{24} $$

where $Q(x) = \int_x^{\infty} \mathrm{D}z$ is the Q-function, the MSE
w.r.t. the payload data $\mathbf{X}_2$ is given by

$$ \mathsf{mse}_{X_2} = 1 - \int \mathrm{D}z \, \tanh\!\left( \tilde{q}_{X_2} + \sqrt{\tilde{q}_{X_2}}\, z \right), \tag{25} $$

and the corresponding MSE w.r.t. $\mathbf{H}$ is
$\mathsf{mse}_H = \frac{\sigma_h^2}{1 + \sigma_h^2 \tilde{q}_H}$.
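The scalar-channel expressions (24) and (25), together with the MSE of a Gaussian channel coefficient, can be evaluated with standard-library functions only. The following sketch assumes unit-power QPSK and uses a trapezoidal rule for the Gaussian integral; the function names are illustrative.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def qpsk_ber(q_tilde):
    """Eq. (24): BER of QPSK over the scalar channel (19) with SNR q_tilde."""
    return q_function(math.sqrt(q_tilde))


def qpsk_mse(q_tilde, num_points=2001, z_max=8.0):
    """Eq. (25): 1 - int Dz tanh(q + sqrt(q) z), trapezoidal rule on [-z_max, z_max]."""
    dz = 2.0 * z_max / (num_points - 1)
    acc = 0.0
    for i in range(num_points):
        z = -z_max + i * dz
        w = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) * dz
        if i in (0, num_points - 1):
            w *= 0.5
        acc += w * math.tanh(q_tilde + math.sqrt(q_tilde) * z)
    return 1.0 - acc


def gaussian_channel_mse(q_tilde, sigma_h2=1.0):
    """MSE of a Gaussian channel coefficient over the scalar channel (18)."""
    return sigma_h2 / (1.0 + sigma_h2 * q_tilde)
```

At zero SNR the BER is one half and the data MSE equals the symbol power, and both decrease monotonically as the effective SNR grows.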

If the channel matrix $\mathbf{H}$ is perfectly known, the t-phase is
not required, so $\beta_2 = \beta$ and $\beta_1 = 0$. Since there is
only one phase in $\mathbf{X}$, we omit the phase index $t$ from all
the parameters concerned in this case. Because $\mathbf{H}$ is
perfectly known, we set $\mathsf{mse}_H = 0$. Plugging this into
(20c), we immediately obtain $q_H = c_H = \sigma_h^2$, which leads to
$q_H q_X = c_H q_X$ and
$c_H c_X - q_H q_X = c_H (c_X - q_X) = c_H \mathsf{mse}_X$. It turns
out that the equivalent signal-to-interference-plus-noise ratio
(SINR) of the scalar Gaussian channel (19) is given by

$$ \tilde{q}_X = \alpha c_H \left( \sum_{b=1}^{2^B} \int \mathrm{D}v \, \frac{\left( \Psi_b'\!\left(\sqrt{c_H q_X}\, v\right) \right)^2}{\Psi_b\!\left(\sqrt{c_H q_X}\, v\right)} \right), \tag{26} $$

and the asymptotic MSE w.r.t. $\mathbf{X}$ is given by
$\mathsf{mse}_X = \mathsf{E}\{|X - \mathsf{E}\{X | Y_{\tilde{q}_X}\}|^2\}$.
If QPSK is used, the MSE in (25) together with $\tilde{q}_X$ in (26)
agrees with [6, (7) & (8)]. More precisely, [6] considers the
real-valued system with BPSK signals, in which case $\sqrt{2}\, r_b$
in our paper should be replaced by $r_b$.
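For the perfect-CSI case, (26) and (25) form a one-dimensional fixed-point problem that can be solved by simple iteration. The sketch below is a hedged illustration for unit-power QPSK ($c_X = 1$): the plain iteration started from $q_X = 0$, the trapezoidal quadrature, and all function names are assumptions of this example rather than the procedure used in the paper.

```python
import math


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gauss_grid(num_points=1001, v_max=8.0):
    """Trapezoidal nodes and weights approximating integration against Dv."""
    dv = 2.0 * v_max / (num_points - 1)
    grid = []
    for i in range(num_points):
        v = -v_max + i * dv
        w = math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi) * dv
        grid.append((v, 0.5 * w if i in (0, num_points - 1) else w))
    return grid


GRID = gauss_grid()


def chi(m, s2, thresholds):
    """Bracketed sum in (26) with m = c_H q_X and s2 = sigma_w^2 + c_H mse_X."""
    r = [-math.inf] + list(thresholds) + [math.inf]
    s, total = math.sqrt(s2), 0.0
    for v, w in GRID:
        V = math.sqrt(m) * v
        for b in range(1, len(r)):
            hi = (math.sqrt(2.0) * r[b] - V) / s
            lo = (math.sqrt(2.0) * r[b - 1] - V) / s
            prob = Phi(hi) - Phi(lo)
            dprob = ((math.exp(-0.5 * lo * lo) - math.exp(-0.5 * hi * hi))
                     / math.sqrt(2.0 * math.pi * s2))
            if prob > 1e-300:
                total += w * dprob * dprob / prob
    return total


def qpsk_mse(q_tilde):
    """Eq. (25) for unit-power QPSK."""
    return 1.0 - sum(w * math.tanh(q_tilde + math.sqrt(q_tilde) * z)
                     for z, w in GRID)


def solve_perfect_csi(alpha, sigma_w2, thresholds, c_h=1.0, num_iters=60):
    """Iterate (26) and q_X = c_X - mse_X (with c_X = 1) from q_X = 0."""
    q_x, q_tilde = 0.0, 0.0
    for _ in range(num_iters):
        s2 = sigma_w2 + c_h * (1.0 - q_x)
        q_tilde = alpha * c_h * chi(c_h * q_x, s2, thresholds)
        q_x = 1.0 - qpsk_mse(q_tilde)
    return q_tilde
```

The returned effective SINR can be inserted into (24) to obtain the predicted BER. For example, `solve_perfect_csi(4.0, 0.1, [0.0])` corresponds to $\alpha = 4$, $\sigma_w^2 = 0.1$ and a 1-bit quantizer.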


V. Numerical Results 

To verify the accuracy of our analytical results, we compare the
analytical BER expression (24) with that obtained by simulations
(performed with the GAMP-based JCD algorithm) for the quantized MIMO
system with QPSK constellation. The simulation results are obtained
by averaging over 10,000 channel realizations. The parameters of the
system are set as follows: $K = 50$, $N = 200$, $T_1 = 50$, and
$T_2 = 450$. The pilot sequences of length $T_1$ are randomly
generated. In all the following simulations, we use the typical
uniform quantizer with quantization step size $\Delta = \sqrt{0.25}$.
Note that we do not optimize the quantization step size but select a
good one for general scenarios; we leave this issue to future work.
Figure 2 shows the corresponding BER results for the cases of a)
perfect CSIR and b) no CSIR. We observe that the analytical BER
expression (24) generally predicts the behavior of the GAMP-based JCD
algorithm well. For the case with no CSIR, the GAMP-based JCD
algorithm does not perform as well as predicted by the analytical
result at low SNRs. A likely reason is that the GAMP-based JCD
algorithm is only an approximation to the Bayes-optimal JCD
estimator. This gap motivates the search for improved estimators in
the future. In addition, from Figure 2, we see that the performance
degradation due to 3-bit quantization is small. For instance, at
BER $= 10^{-3}$, the 3-bit Bayes-optimal JCD estimator incurs an SNR
loss of only 1.19 dB relative to the unquantized system with perfect
CSIR. Even with 2-bit quantization, the loss of 2.8 dB remains
acceptable.

Fig. 2. BER versus SNR $= 1/\sigma_w^2$ for QPSK constellations. The
JCD estimation scheme is used under the settings with a) perfect CSIR
and b) no CSIR. Curves denote analytical results and markers denote
Monte-Carlo simulation results achieved by the GAMP-based JCD
algorithm.

Comparing Figures 2(a) and 2(b), we see that the loss due to no CSIR
is small for the proposed JCD estimator. Therefore, it is of interest
to evaluate the improvement due to JCD estimation. Using the same
system parameters as before, Figure 3 compares the BERs under the
conventional pilot-only scheme and the proposed JCD estimation
scheme. For the pilot-only scheme, we adopt the receiver structure of
[8], which employs the least squares (LS) channel estimate for the
quantized MIMO system. However, unlike [8], we then employ the GAMP
algorithm for payload data detection based on the estimated channel.
Therefore, the BERs of the pilot-only scheme shown in Figure 3 are
expected to be better than those obtained with the suboptimal
criteria of [8]. Even so, as can be seen from Figure 3, JCD
estimation still shows a large improvement over the pilot-only
scheme.

Fig. 3. Average BER versus SNR for QPSK constellations. The
GAMP-based JCD algorithm and the pilot-only scheme are used. Plots
are based on Monte-Carlo simulation results.

Fig. 4. The achievable rates as functions of $\alpha = N/K$ for a)
the Bayes-optimal JCD estimator and b) the pilot-only scheme.
$\beta = 10$, $\beta_1 = 1$, $\sigma_w^2 = 10^{-1}$, and
$X_2 \sim \mathcal{N}_{\mathbb{C}}(0, 1)$.

Finally, we compare the achievable rates as functions of the antenna
ratio $\alpha = N/K$ for the Bayes-optimal JCD estimator and the
pilot-only scheme under different quantization precisions in
Figure 4. Note that, unlike the QPSK signals used in the previous
simulations, we consider Gaussian signals, i.e.,
$X_2 \sim \mathcal{N}_{\mathbb{C}}(0, 1)$, in this experiment. We
observe that the achievable rates for all quantization precisions
increase with the number of receive antennas, even for 1-bit
receivers. This implies that the use of high-order modulation schemes
is also possible in 1-bit MIMO systems, which agrees with [5]. This
property is quite different from the quantized SISO system, where the
achievable rate is always upper bounded by $B$ bits in a $B$-bit
receiver [4]. If we fix the achievable rate at 7 bps/Hz, the number
of receive antennas required by the 3-bit Bayes-optimal JCD estimator
is only about 1.4 times that of the unquantized Bayes-optimal JCD
estimator (the benchmark receiver), whereas even with unquantized
receivers the pilot-only scheme requires about 4.3 times as many
receive antennas as the benchmark receiver. The penalty of a 1.4-fold
increase in the number of antennas in exchange for very low-precision
ADCs seems quite acceptable.


VI. Conclusion 

We have developed a framework for studying the best possible
estimation performance of the quantized MIMO system. In particular,
we used Bayes-optimal inference for the JCD estimation and realized
this estimation by applying the BiG-AMP technique. Additionally, the
asymptotic performance measures (e.g., MSEs) w.r.t. the channels and
the payload data were derived in the large-system limit. A set of
Monte-Carlo simulations was conducted to illustrate that our
analytical results accurately predict the performance of the
Bayes-optimal JCD estimator. The numerical results also revealed that
the JCD estimation scheme provides a large improvement over the
conventional pilot-only scheme.


Appendix A: Proof of Proposition 1 


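As stated in Section IV, the MSEs of interest are saddle points of
the average free entropy (16). However, direct calculation is very
difficult. Thus, we resort to the replica method by computing the
replicated partition function
$\mathsf{E}_{\tilde{\mathbf{Y}}}\{P^{\tau}(\tilde{\mathbf{Y}})\}$ in
(17), which with the definition in (10) can be expressed as

$$ \mathsf{E}_{\tilde{\mathbf{Y}}}\{P^{\tau}(\tilde{\mathbf{Y}})\} = \mathsf{E}_{\mathcal{H}, \mathcal{X}} \left\{ \int \mathrm{d}\tilde{\mathbf{Y}} \prod_{a=0}^{\tau} P_{\mathrm{out}}\!\left( \tilde{\mathbf{Y}} \,\middle|\, \mathbf{Z}^{(a)} \right) \right\}, \tag{27} $$

where we define $\mathbf{Z}^{(a)} = \mathbf{H}^{(a)} \mathbf{X}^{(a)} / \sqrt{K}$,
with $\mathbf{H}^{(a)}$ and $\mathbf{X}^{(a)}$ being the $a$-th
replicas of $\mathbf{H}$ and $\mathbf{X}$, respectively, and use the
notations $\mathcal{X} = \{\mathbf{X}^{(a)}, \forall a\}$ and
$\mathcal{H} = \{\mathbf{H}^{(a)}, \forall a\}$. Here,
$(\mathbf{H}^{(a)}, \mathbf{X}^{(a)})$ are random matrices drawn from
the distribution $(P_H, P_X)$ for $a = 0, 1, \ldots, \tau$. In
addition, $\int \mathrm{d}\tilde{\mathbf{Y}}$ denotes the integral
w.r.t. a discrete measure because the quantized output
$\tilde{\mathbf{Y}}$ takes values in a finite set. Next, we focus on
calculating the right-hand side of (27), which can be done by
applying the techniques in [14,20,21] after additional manipulations.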

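First, in order to average over $(\mathcal{H}, \mathcal{X})$, we
introduce two $(\tau + 1) \times (\tau + 1)$ matrices
$\mathbf{Q}_H = [Q_H^{ab}]$ and $\mathbf{Q}_{X_t} = [Q_{X_t}^{ab}]$
whose elements are defined by
$Q_H^{ab} = \mathbf{h}_n^{(a)} (\mathbf{h}_n^{(b)})^{\dagger} / K$ and
$Q_{X_t}^{ab} = (\mathbf{x}_j^{(a)})^{\dagger} \mathbf{x}_j^{(b)} / K$.
Here, $\mathbf{h}_n^{(a)}$ denotes the $n$th row vector of
$\mathbf{H}^{(a)}$, $\mathbf{x}_j^{(a)}$ denotes the $j$th column
vector of $\mathbf{X}^{(a)}$ corresponding to phase block $t$, and
$\mathcal{T}_t$ for $t = 1, 2$ represents the set of all symbol
indices in phase block $t$. The definitions of $\mathbf{Q}_H$ and
$\mathbf{Q}_{X_t}$ are equivalent to

$$ 1 = \int \prod_{n=1}^{N} \prod_{0 \le a \le b}^{\tau} \delta\!\left( \mathbf{h}_n^{(a)} (\mathbf{h}_n^{(b)})^{\dagger} - K Q_H^{ab} \right) \mathrm{d}Q_H^{ab}, $$

$$ 1 = \int \prod_{t=1}^{2} \prod_{j \in \mathcal{T}_t} \prod_{0 \le a \le b}^{\tau} \delta\!\left( (\mathbf{x}_j^{(a)})^{\dagger} \mathbf{x}_j^{(b)} - K Q_{X_t}^{ab} \right) \mathrm{d}Q_{X_t}^{ab}, $$

where $\delta(\cdot)$ denotes Dirac's delta. Let
$\mathcal{Q}_X = \{\mathbf{Q}_{X_t}, \forall t\}$ and
$\mathcal{Z} = \{\mathbf{Z}^{(a)}, \forall a\}$. Inserting the above
into (27) yields

$$ \mathsf{E}_{\tilde{\mathbf{Y}}}\{P^{\tau}(\tilde{\mathbf{Y}})\} = \int e^{K^2 \mathcal{G}^{(\tau)}} \, \mathrm{d}\mu^{(\tau)}(\mathbf{Q}_H) \, \mathrm{d}\mu^{(\tau)}(\mathcal{Q}_X), \tag{28} $$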


where 



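Then, using the Fourier representation of the $\delta$ function and
computing the integrals by the saddle point method, we attain

$$ \frac{1}{K^2} \log \mathsf{E}_{\tilde{\mathbf{Y}}}\{P^{\tau}(\tilde{\mathbf{Y}})\} = \underset{\mathbf{Q}_H, \mathcal{Q}_X, \tilde{\mathbf{Q}}_H, \tilde{\mathcal{Q}}_X}{\mathrm{Extr}} \left\{ \Phi^{(\tau)} \right\} \tag{29} $$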


with 



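where $\mathrm{Extr}_x\{f(x)\}$ represents the extreme value of $f(x)$
w.r.t. $x$, $\tilde{\mathbf{Q}}_H = [\tilde{Q}_H^{ab}] \in
\mathbb{C}^{(\tau+1) \times (\tau+1)}$, and $\tilde{\mathcal{Q}}_X
\triangleq \{\tilde{\mathbf{Q}}_{X_t} = [\tilde{Q}_{X_t}^{ab}] \in
\mathbb{C}^{(\tau+1) \times (\tau+1)}, \forall t\}$. According to
(17), the average free entropy turns out to be
$\Phi = \lim_{\tau \to 0} \frac{\partial}{\partial \tau}
\mathrm{Extr}_{\mathbf{Q}_H, \mathcal{Q}_X, \tilde{\mathbf{Q}}_H, \tilde{\mathcal{Q}}_X}
\{\Phi^{(\tau)}\}$.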

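The saddle points of $\Phi^{(\tau)}$ can be obtained by seeking the
point of zero gradient w.r.t. $\{\mathbf{Q}_H, \mathcal{Q}_X,
\tilde{\mathbf{Q}}_H, \tilde{\mathcal{Q}}_X\}$. However, in doing so,
it is prohibitive to obtain explicit expressions for the saddle
points. Therefore, we assume that the saddle points follow the RS
form [22]:
$\mathbf{Q}_H = (c_H - q_H)\mathbf{I} + q_H \mathbf{1}$,
$\tilde{\mathbf{Q}}_H = (\tilde{c}_H - \tilde{q}_H)\mathbf{I} + \tilde{q}_H \mathbf{1}$,
$\mathbf{Q}_{X_t} = (c_{X_t} - q_{X_t})\mathbf{I} + q_{X_t} \mathbf{1}$,
and
$\tilde{\mathbf{Q}}_{X_t} = (\tilde{c}_{X_t} - \tilde{q}_{X_t})\mathbf{I} + \tilde{q}_{X_t} \mathbf{1}$,
where $\mathbf{I}$ denotes the identity matrix and $\mathbf{1}$
denotes the all-one matrix. In addition, the application of the
central limit theorem suggests that the
$\mathbf{z}_{nj} = [z_{nj}^{(0)} \ z_{nj}^{(1)} \cdots z_{nj}^{(\tau)}]^T$
are Gaussian random vectors with $(\tau + 1) \times (\tau + 1)$
covariance matrix $\mathbf{Q}_{Z_t}$. If $j \in \mathcal{T}_t$, then
the $(a, b)$th entry of $\mathbf{Q}_{Z_t}$ is given by

$$ (\mathbf{Q}_{Z_t})^{ab} = Q_H^{ab} \, Q_{X_t}^{ab}. \tag{30} $$

Therefore, we set
$\mathbf{Q}_{Z_t} = (c_H c_{X_t} - q_H q_{X_t})\mathbf{I} + q_H q_{X_t} \mathbf{1}$,
which is equivalent to introducing the Gaussian random variable
$z_{nj}^{(a)}$ for $j \in \mathcal{T}_t$ as

$$ z_{nj}^{(a)} = \sqrt{c_H c_{X_t} - q_H q_{X_t}} \; u^{(a)} + \sqrt{q_H q_{X_t}} \; v \tag{31} $$

for $a = 0, 1, \ldots, \tau$, where $u^{(a)}$ and $v$ are independent
standard complex Gaussian random variables.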
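That the representation (31) reproduces the RS covariance can be checked directly. In the short derivation below, $\delta_{ab}$ denotes the Kronecker delta, and independence and unit variance of $u^{(a)}$ and $v$ are used; the cross terms vanish because these variables are zero-mean and independent.

```latex
\begin{align*}
\mathsf{E}\{ z_{nj}^{(a)} (z_{nj}^{(b)})^{*} \}
 &= (c_H c_{X_t} - q_H q_{X_t})\, \mathsf{E}\{ u^{(a)} (u^{(b)})^{*} \}
    + q_H q_{X_t}\, \mathsf{E}\{ |v|^{2} \} \\
 &= (c_H c_{X_t} - q_H q_{X_t})\, \delta_{ab} + q_H q_{X_t}.
\end{align*}
```

The diagonal entries therefore equal $c_H c_{X_t}$ and the off-diagonal entries equal $q_H q_{X_t}$, which is the matrix $(c_H c_{X_t} - q_H q_{X_t})\mathbf{I} + q_H q_{X_t}\mathbf{1}$.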

Substituting these RS expressions into $\Phi^{(\tau)}$ leads to the
RS expression of $\Phi^{(\tau)}$. With the RS, we only have to
determine the parameters $\{c_H, q_H, c_{X_t}, q_{X_t}, \tilde{c}_H,
\tilde{q}_H, \tilde{c}_{X_t}, \tilde{q}_{X_t}\}$, which can be
obtained by equating the corresponding partial derivatives of
$\Phi^{(\tau)}$ to zero. In doing so, as $\tau \to 0$, we get
$\tilde{c}_H = 0$, $\tilde{c}_{X_t} = 0$,
$c_H = \mathsf{E}\{|H|^2\}$, $c_{X_t} = \mathsf{E}\{|X_t|^2\}$, and
the other parameters $\{q_H, q_{X_t}, \tilde{q}_H, \tilde{q}_{X_t}\}$
are given by (20) in Proposition 1. Let
$\mathsf{mse}_H = c_H - q_H$ and
$\mathsf{mse}_{X_t} = c_{X_t} - q_{X_t}$. After taking these into
account, we obtain the RS expression of the average free entropy as

$$ \Phi = \alpha \sum_{t=1}^{2} \beta_t \left( \sum_{b=1}^{2^B} \int \mathrm{D}v \, \Psi_b(V_t) \log \Psi_b(V_t) \right) - \alpha I(H; Z_H | \tilde{q}_H) - \sum_{t=1}^{2} \beta_t I(X_t; Z_{X_t} | \tilde{q}_{X_t}) + \alpha (c_H - q_H) \tilde{q}_H + \sum_{t=1}^{2} \beta_t (c_{X_t} - q_{X_t}) \tilde{q}_{X_t}, \tag{32} $$

where we have defined $V_t = \sqrt{q_H q_{X_t}}\, v$, $\Psi_b(V_t)$ is
given by (22), and the notation $I(X; Z | q)$ denotes the mutual
information between $X$ and $Z$ for a Gaussian scalar channel
$Z = \sqrt{q}\, X + W$ with $W \sim \mathcal{N}_{\mathbb{C}}(0, 1)$.
Note that the parameters
$\{q_H, q_{X_t}, \tilde{q}_H, \tilde{q}_{X_t}\}$ given by (20) are
saddle points of (32).